ABSTRACT

The available objective evidence suggests that the 
accuracy of predicting which students will succeed in a particular 
graduate school is often no better than modest, especially if such 
predictions are based only upon a test or a grade record. Taken 
together these two types of predictors do a reasonably good job, 
considering the restricted range of ability involved. The best way to 
improve selection of graduate students will be to develop improved 
criteria of success. This is no small job for graduate faculties, but 
it carries the promise of more effective utilization of talent and 
greater assurance of equity in admitting students to advanced levels 
of training and the privilege associated with such programs. 
(Author) 

Warren W. Willingham

In recent decades graduate schools have assumed a major
responsibility for the advanced training of a talented segment of
American society. Compared with lower forms of schooling, most
graduate programs are costly as well as intellectually demanding.
Students who complete these programs feed the professions and
academic disciplines and constitute a critical national resource.
Traditionally most graduate students have been selected with great
care but until the past decade or so there were relatively few formal
statistical studies of the selection process. Such investigations are
now common.

There are several possible explanations for recent interest in
prediction studies of success in graduate education. In earlier times
space in graduate schools and the number of applicants were in a rough
equilibrium but burgeoning applicant groups in the fifties and sixties
focused attention on how some were selected and others turned away.
These larger numbers of students permitted statistical studies in many
departments which previously had too few students to make this type of
systematic evaluation worthwhile. Finally, increasing use of selection
tests (Graduate Record candidates increased from 100,000 to [figure
illegible] during the 1960s) suggested the prediction studies with
which similar tests are closely associated at the undergraduate level.
The purpose of this report is to summarize the results of the
substantial number of such studies that have accumulated, to suggest
some practical implications for selecting graduate students, and to
indicate further research needed on the topic.

An earlier version of this paper was presented at a presession of the
Council of Graduate Schools meeting in November, 1972. The author
expresses appreciation to Jane Porter for assistance in compiling the
data on which this report is based.

Correlational analysis is the principal research design for
evaluating the selection process. One or more predictors (measures of
student potential) are evaluated with respect to the extent to which
they accurately forecast one or more criteria (measures of student
success typically taken after a year or more in graduate school). The
value of a predictor for selecting students varies directly with the
size of its correlation with the criterion (Cronbach, 1971). This
correlation, called a validity coefficient, ranges from a chance
relationship of .00 to a perfect relationship of 1.00, though validity
coefficients can be negative and perfect validity is not closely
approached in practice. Usually more than one predictor is involved
(e.g., a test and a grade average) and in such cases a statistically
weighted composite of the predictors is typically more useful for
selection purposes than either predictor alone.
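
As a rough illustration of the two ideas in the paragraph above (a
validity coefficient as a correlation, and a statistically weighted
composite of two predictors), the following Python sketch uses
synthetic data. The variable names, the sample size, and the noise
levels are assumptions made only for this example and are not taken
from any study reviewed here. The composite uses ordinary
least-squares weights, which is one common weighting scheme; the
studies summarized may have weighted their composites differently.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def standardize(x):
    """Rescale a sequence to mean 0 and standard deviation 1."""
    n = len(x)
    mean_x = sum(x) / n
    sd = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    return [(a - mean_x) / sd for a in x]


def composite_weights(r1, r2, r12):
    """Least-squares weights for two standardized predictors, given their
    validities r1, r2 and their intercorrelation r12."""
    denom = 1 - r12 ** 2
    return (r1 - r2 * r12) / denom, (r2 - r1 * r12) / denom


def composite_validity(r1, r2, r12):
    """Validity (multiple R) of the least-squares composite."""
    return math.sqrt((r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2))


# Synthetic data (hypothetical): a latent ability plus independent noise.
random.seed(0)
n_students = 200
ability = [random.gauss(0, 1) for _ in range(n_students)]
test = [a + random.gauss(0, 1) for a in ability]         # e.g. a test score
grades = [a + random.gauss(0, 1) for a in ability]       # e.g. a grade average
criterion = [a + random.gauss(0, 1.5) for a in ability]  # e.g. a later rating

r_test = pearson(test, criterion)
r_grades = pearson(grades, criterion)
r_between = pearson(test, grades)

w_test, w_grades = composite_weights(r_test, r_grades, r_between)
composite = [w_test * t + w_grades * g
             for t, g in zip(standardize(test), standardize(grades))]
r_composite = pearson(composite, criterion)

print(f"validity of test alone:   {r_test:.2f}")
print(f"validity of grades alone: {r_grades:.2f}")
print(f"validity of composite:    {r_composite:.2f}")
```

Within the sample used to derive the weights, the composite's validity
can never fall below that of the better single predictor. In a new
sample the advantage usually shrinks somewhat.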

There are a variety of measures that can be used as predictors of
success; there are also various measures that can serve as criteria
after admission to graduate school. None are entirely satisfactory.
Any predictor or criterion should have reasonable construct validity,
reliability, and acceptability. By construct validity we mean that the
predictor or criterion should be relevant to what we intend to
measure. More specifically, it should represent what we want to
measure, all we want to measure, and nothing but that which we want to
measure (Thorndike and Hagen, 1969). By reliability we mean that a
measure provides a stable estimate from one measuring occasion to
another and is free from distortion. By acceptability, we mean that a
measure is economically feasible, administratively practical and
socially ethical. It is in this context of construct validity,
reliability, and acceptability that we can review briefly some of the
strengths and weaknesses of predictors and criteria before addressing
the empirical and utilitarian question of the relationship between the
two.

Predictors

Undergraduate Grade Point Average. The student's undergraduate
average has obvious relevance as a predictor because it represents the
same sort of behavior one is trying to forecast. It is a matter of
psychometric, as well as everyday experience that past behavior is
usually the best predictor of future behavior. Undergraduate GPA is
readily available, widely assumed to be fair and equitable, and almost
universally used by graduate departments in selecting students. The
measure has two important weaknesses. It often has a narrow range
(from 3.0 to 4.0 in many departmental candidate groups) and thus
doesn't differentiate applicants very well. Also the meaning of a B
average varies considerably from one undergraduate college to another.

Recommendations. References from undergraduate professors are
widely used despite the fact that they are time consuming to prepare
and sometimes difficult to quantify or even interpret. Recommendations
can be highly relevant, particularly in the sense that an informed
person can judge a student's suitability for a particular graduate
program. In many situations the Achilles heel of personal references
is the unreliability or lack of comparability among judges.

...seventy years of test research and theoretical development. On the
one hand this work has produced reliable, standardized measures highly
suitable for national administration under security conditions. More
importantly, this work has developed detailed conceptual frameworks of
human abilities and established relationships between underlying
abilities and socially valued observable behavior such as scholastic
competence. Due to this psychometric development it is possible to
construct tests to measure any of an extremely wide range of human
abilities. A corresponding weakness of most individual tests, however,
is their tendency to focus attention on fairly limited aspects of
competency. Another weakness is the lingering suspicion (despite
substantial evidence to the contrary; see Linn, 1973, Stanley, 1971)
that standardized tests are intrinsically biased against individuals
from cultural minorities.

Biographical Information. Various characteristics of an applicant's
background are used implicitly if not formally when graduate
departments select students. Both the strengths and weaknesses of
background information lie in its construct validity or relevance to
success in the graduate program. Special accomplishments or experience
of students can be highly relevant. Characteristics such as age, sex,
or race may be quite irrelevant but nonetheless used for legitimate
social or administrative reasons. On the other hand some particular
characteristic of applicants may be easy to collect but treacherous to
use in selection decisions if there is no logical and defensible
explanation for an observed relationship between that characteristic
and success in graduate school.

Criteria 

Graduate Grade Point Average. The grades a student makes in
graduate school are a readily available and certainly relevant
indication of success. But traditionally, grades in many graduate
schools have consisted largely of A's and B's. Thus the range is so
narrow that differences among GPAs do not usually represent reliable
differences in student accomplishment. Furthermore, many faculty
members doubt even that reliable grades represent the most important
outcomes of graduate education.

Comprehensive Examination. Many departments require students to
pass a qualifying or comprehensive examination at some point during
their graduate program. In theory, a properly constructed examination
could provide the most reliable and valid criterion of subject
competence. In practice such examinations likely vary widely in
quality. In any event an objective examination should not serve as a
sole criterion since it measures a limited aspect of success.
Furthermore it suggests a logical circularity when test scores are
used mainly to predict test scores.



Faculty Ratings. The principal advantage of collective faculty
judgment is versatility in measuring important aspects of graduate
success other than knowledge of subject field. Faculty who know a
student well are in the best position to say whether he is able to
execute independent research or motivate a class of undergraduates. A
weakness is the fact that many faculty ratings are unreliable and not
carefully designed to represent observable outcomes of graduate
training.

Attain Ph.D. Regardless of what other judgments a faculty may make
of a doctoral student, the carefully considered final test is whether
he or she is granted the degree. Consequently, this is probably the
single most defensible and relevant criterion of success at the Ph.D.
level. On the negative side, one must wait a long time for this
criterion. In fact years lapsed between BA and Ph.D. attainment is a
corollary criterion used in some studies. Another difficulty is the
fact that whether a student graduates may frequently depend upon
extraneous influences, rather than demonstrated competence. In any
event this criterion places a premium on academic persistence and
probably does not differentiate very well the most promising scholars
and professionals.



Nature of the Data

By far the most common predictors used in studies of success in
graduate school are undergraduate average and Graduate Record
Examinations scores. This review is based upon correlation studies
using these variables that were cited by Lannholm (1968, 1972) or
located through searches of appropriate journals and abstracts.
Forty-three studies were found for the period 1952-1972 though about
half were dated during the last five years of that span (see list
appended). Half of these studies were published; the remainder were
institutional reports or theses.

The 43 studies included 138 independent sets of data, usually
corresponding to departments though occasionally representing some
broader group such as first year students across several departments.
Individual sets of data were based upon 20 to 1479 students (Median
N = 80). The total number of students included in all studies was
21,214. The total number of validity coefficients was 616. These
coefficients are summarized in Table 1.

The first two predictors in Table 1 refer to the Verbal and
Quantitative sections of the GRE Aptitude Test. The GRE Advanced Test
evaluates achievement in the student's chosen field; thus the content
varies, depending upon the department involved. The fourth predictor
varied somewhat from study to study. It was usually the average of two
or three GRE scores though this composite was occasionally weighted
statistically. The undergraduate GPA was undoubtedly computed in
various ways in different studies but seldom specified very carefully.
The data concerning recommendations came almost exclusively from three
extensive studies of National Science Foundation fellowship applicants
(Creager, 1965; Rock and Harmon, [year illegible]) and represent the
average rating of several letters of reference.

With respect to criteria of success, the exact nature of faculty
ratings varied from study to study but typically represented the
consensus judgments of several faculty members concerning professional
promise or overall success as a graduate student. Very few studies
reported validity data with departmental exams as the criterion.
"Attain Ph.D." typically means attaining the degree within a certain
number of years, so a time element is also involved. That factor is
formalized in the "time to Ph.D." criterion by assigning criterion
scores to students according to years elapsed between BA and Ph.D. All
of the data concerning this last criterion come from two very large
studies by Creager (1965).

Predictability of Graduate Success

The studies represented in Table 1 vary widely in quality and scope.
Some are based on small samples, making individual correlations
unreliable, but those medians based on more than just a few
coefficients should give a dependable idea of how valid these
predictors are and how predictable are the various criteria of
graduate success. Insofar as possible the same data have been sorted
by major field and presented in Table 2 to illustrate differential
validity of the predictors for different disciplines. Several
observations can be made from these tables.

Validity coefficients for the various predictors and composites
(against the GPA criterion) tend to be about [figure illegible] lower
than corresponding median coefficients at the undergraduate level
(Fishman and Pasanella, 1960).

The undergraduate GPA is a moderately good predictor of graduate
GPA and faculty ratings; it is a poor predictor of whether a student
will attain the Ph.D. Depending upon the success criterion used, the
GRE composite is either slightly more valid or substantially more
valid than the undergraduate GPA.

The GRE Quantitative is typically a better predictor in those
scientific fields where quantitative ability counts most. The reversal
in the field of mathematics may be due to restriction in the range of
quantitative scores because of heavy emphasis on this variable in
selection. Correspondingly, the GRE Verbal tends to be more valid in
verbally oriented disciplines such as English and education. Otherwise
the pattern of validity coefficients is fairly similar from one
discipline to the next.

The GRE Advanced is evidently the most generally valid predictor
among those included. In seven of the nine disciplines in Table 2 it
has the highest validity among the three GRE scores. In eight of the
nine fields it has higher validity than undergraduate GPA.

Recommendations appear to be a fairly poor predictor of whether a
student will successfully complete a doctoral program.

The comprehensive departmental examination seems a somewhat more
predictable criterion than the others examined here. This is an
uncertain conclusion because the available data are sparse but the
conclusion is consistent with the reasonable assumption that such a
criterion should be more reliable than the others represented.

A weighted composite including undergraduate GPA and one or more
GRE scores typically provides a validity coefficient in the .40-.45
range for various criteria of success and for different academic
fields. This is somewhat higher than the validity of GRE alone. The
composite of undergraduate GPA and GRE provides substantially more
accurate prediction than does undergraduate GPA alone. This is the
case for each success criterion and practically every academic
discipline.

The Utility of Current Predictions

What overall evaluation can be made of the extent to which success
in graduate school is predictable? Cronbach (1971) describes the
following considerations in determining the utility of a predictor for
selection purposes. First, the utility of the predictor is directly
related to the size of the correlation coefficient. Thus Table 2
indicates that in most fields the value of the GRE-GPA composite
prediction amounts to about 40% of the benefit that could accrue if
prediction were perfect. How useful that level of validity is in
practical terms depends upon the cost of gathering the predictor
information and two other considerations. A small correlation can
produce a large benefit if the proportion of students selected is low.
Finally, a given validity coefficient will have more practical value
if the selection decision is important, and the selection decision is
more important if it is irreversible.
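
The first two considerations above can be sketched numerically. The
Python fragment below is a minimal illustration, assuming that
predictor and criterion are standardized and jointly normal and that
applicants are admitted strictly from the top of the predictor
distribution; under those assumptions the expected criterion standing
of the admitted group is the validity coefficient multiplied by the
group's mean predictor score. The function name and the example
values are hypothetical and are not drawn from Cronbach (1971) or from
the studies reviewed here.

```python
from statistics import NormalDist


def expected_gain(validity, selection_ratio):
    """Expected mean criterion z-score of the admitted group, assuming
    bivariate normality and strict top-down selection on the predictor."""
    std_normal = NormalDist()
    # Predictor cutoff that admits the top `selection_ratio` of applicants.
    cutoff = std_normal.inv_cdf(1 - selection_ratio)
    # Mean predictor z-score among those above the cutoff.
    mean_predictor_z = std_normal.pdf(cutoff) / selection_ratio
    return validity * mean_predictor_z


for validity, ratio in [(0.40, 0.50), (0.20, 0.10), (1.00, 0.50)]:
    gain = expected_gain(validity, ratio)
    print(f"validity {validity:.2f}, admit {ratio:.0%}: gain {gain:.2f} SD")
```

Under these assumptions a validity of .40 always yields 40% of the
gain that perfect prediction would give at the same selection ratio,
and a predictor with validity of only .20 applied at a 10% selection
ratio yields a larger expected gain than one with validity .40 applied
at a 50% ratio.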



We might say that a validity coefficient of .20 is modest and one of
.40 moderate. The conditions of graduate student selection are
generally favorable to using predictors of even modest validity. In
many departments only a small proportion of students are accepted; the
decisions are quite important to the student and to broader interests;
and the decisions are typically irreversible. There seems little doubt
that the GRE and the undergraduate GPA are providing quite useful
information in most situations.

Figure 1 illustrates graphically the level of benefit likely to
accrue from using predictors that are valid to the extent indicated.
Students at high ability levels were far more likely to attain the
Ph.D. than those at low levels. The figure also illustrates that many
students fail to attain the degree, even among talented NSF fellowship
applicants. And in these samples reported by Creager (1965) there were
substantial differences in attainment rates among fields (Chemistry
51%, Physics 36%, Psychology 26%).

It should be emphasized also that validity studies at particular
schools and departments give varying results. Such variability is
exacerbated by the small samples often used, but real variations do
occur. It is important to undertake local studies in order to justify
selection procedures and utilize available information to maximum
benefit.

Can Predictions be Improved?

What are the prospects of improving prediction of graduate success?
One cause for pessimism is the very restricted range of talent
involved. Many of the studies summarized here are based upon highly
selective departments or groups like NSF fellowship applicants. For
this reason alone one would expect substantially lower validity
coefficients than are typical at the undergraduate level.
Consequently, it does not follow that the predictors are inherently
any less valid. The GRE aptitude test, for example, is basically
similar to the less difficult and "more valid" Scholastic Aptitude
Test. Judging from considerable research at the undergraduate level it
seems unlikely that other types of aptitudes can enhance prediction to
any significant degree.
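
The effect of a restricted range on a validity coefficient can be
made concrete with the standard correction for direct selection on the
predictor (often called Thorndike's Case 2). The sketch below is an
illustration only: it assumes a linear, homoscedastic relationship
and selection made directly on the predictor, and the example values
(an observed coefficient of .30 and an applicant pool twice as
variable as the admitted group) are hypothetical. The paper itself
does not report corrected coefficients.

```python
import math


def correct_for_range_restriction(r_restricted, sd_ratio):
    """Estimate the validity in the unrestricted group.

    r_restricted: correlation observed among the selected students.
    sd_ratio: predictor SD among all applicants divided by predictor SD
              among the selected students (1.0 means no restriction).
    """
    u = sd_ratio
    return r_restricted * u / math.sqrt(1 + r_restricted ** 2 * (u ** 2 - 1))


observed = 0.30
for ratio in (1.0, 1.5, 2.0):
    corrected = correct_for_range_restriction(observed, ratio)
    print(f"SD ratio {ratio:.1f}: observed {observed:.2f} -> estimated {corrected:.2f}")
```

With no restriction the coefficient is unchanged; as the admitted
group becomes more homogeneous relative to the applicant pool, the
same observed coefficient implies a progressively higher validity in
the full pool.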

The undergraduate GPA suffers similar shortcomings as does the high
school average in predicting success at the next educational level.
The range of the grade average is greatly restricted by selection and
the grade scale varies considerably depending upon the origin of the
student. There have been many efforts to develop both simple and
highly sophisticated methods of adjusting grade averages to correct
for grading variations from school to school. Linn's (1966) review of
the extensive work on this problem at the undergraduate level
indicates that such adjustments result in little if any improvement in
prediction beyond that offered by joint consideration of an aptitude
test and the grade record. There has been only spotty work on this
problem at the graduate level and none of it suggests any different
conclusion (e.g., see Mehrabian, 1969). There is some indication,
however, that prediction of success in some graduate business schools
is enhanced somewhat by considering the quality of the undergraduate
school (Pitcher and Schrader, 1972).

Anyone with long experience in selecting or training students in
higher education is very inclined to plead for some way to measure
student motivation through personality scales, interest inventories,
background information or whatever. There have been many pertinent
studies at the undergraduate level and Freeberg (1965) documents a
number of instances where such student self-report devices have made
small but significant contributions to predicting grades. But Kendrick
(1964) describes well the host of practical problems and ethical
objections that have inhibited formal use of such information in
selection.

A slightly different and perhaps more acceptable use of a
motivation measure would be for the purpose of identifying groups of
students who differ considerably in the extent to which success is
predictable from traditional ability measures. Don Rock, with support
from the Graduate Record Examinations Board, is presently studying
that possibility as an outgrowth of earlier efforts to locate such
moderating effects on the basis of age or quality of undergraduate
school (Rock and Harmon, 1972).

One might suppose that motivation to undertake graduate work would
be one important quality reflected in letters of recommendation, but
the validity of such references is disappointingly low. In extensive
studies of NSF fellowship applicants, the reliability of single
references was reported to be in the low .30's (Harmon, 1966). This
may be the main reason why recommendations are poor predictors, but
careful efforts to improve that reliability with multiple ratings did
not result in good validity for the NSF fellowship recommendations.
Such results do not suggest that improved letters of reference are a
promising possibility for increased accuracy of prediction.
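
To see how much averaging several references could be expected to
help, one can apply the Spearman-Brown formula to the single-reference
reliability of about .30 quoted above. This is offered only as a
hedged illustration: the paper does not say that this formula was
used in the NSF studies, the formula assumes the ratings are parallel
(equally reliable and equally intercorrelated), and the numbers of
letters shown are hypothetical.

```python
def spearman_brown(single_reliability, n_ratings):
    """Reliability of the average of `n_ratings` parallel ratings,
    given the reliability of a single rating."""
    r = single_reliability
    return n_ratings * r / (1 + (n_ratings - 1) * r)


single = 0.30  # approximate reliability of one reference, as quoted in the text
for n_letters in (1, 3, 5):
    pooled = spearman_brown(single, n_letters)
    print(f"{n_letters} letter(s): reliability of the average about {pooled:.2f}")
```

On these assumptions the average of three letters would have a
reliability of roughly .56 and the average of five roughly .68, so
pooling raises reliability appreciably. The point made in the text is
that even this improvement did not produce good validity.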

So much for predictors. What are the prospects for improving the
criteria? Graduate grade point average is traditionally subject to
varying interpretation and practices which tend to make it unreliable.
In recent years graduate faculty seem even more dubious regarding the
value of the GPA as a criterion. With different grading procedures the
GPA could theoretically be a quite good criterion but there is little
reason to expect that to happen in the foreseeable future. Systematic
faculty ratings of different aspects of graduate success seem to be a
more feasible development but there has been limited theoretical
rationale to guide such extension. The comprehensive departmental
examination, if properly developed, could very likely serve as a
highly reliable criterion but it would place primary emphasis upon
that aspect of success that is associated with content knowledge
reproducible in a written test.

In some respects "Ph.D. attainment" (and its corollary "time to
Ph.D.") is the most defensible criterion of those represented here.
Not only does it represent the final reality of success; it also
includes all those personal characteristics like ability,
organization, and persistence that are normally considered necessary
in the successful doctoral candidate. Unfortunately the researcher
must wait a long time for this criterion. It is seldom applied to the
MA degree and may be much less appropriate at that level, particularly
in graduate departments with heavy emphasis on the Ph.D. A more
serious shortcoming of this criterion is the fact that it is not easy
to predict. There are similar types of behavior (e.g., employee
turnover or withdrawal from military flight training) which are also
dependent upon voluntary persistence. Such events are notoriously
difficult to predict accurately, probably because lack of persistence
may be due to a wide variety of independent contingencies, any one of
which may be unimportant for most people but critical for a few.

We might sum up the preceding discussion as follows. There is no
doubt that present predictors, taken together, are providing a useful
means of reducing some of the guesswork in selecting graduate
students. Nonetheless, attrition of able students is disturbingly
high. To the extent that attrition represents a mismatch of students
and programs it is important to improve the validity of selection
procedures.

Unfortunately the foregoing paragraphs do not present an optimistic
picture of the possibilities for improving prediction of success in
graduate school. There is no obvious way to improve the validity of
existing measures of student potential. From past experience there is
little reason to expect that new measures will do a substantially
better job of predicting conventional criteria. Improving the focus
and reliability of present criteria might well improve validity
coefficients somewhat; it would not likely have much effect on which
students are selected (i.e., one would still simply choose the
students with the highest scores on the same predictors). The main
problem is that we operate almost exclusively with one prediction
strategy dominated by the notion of scholastic aptitude. There are,
however, alternate prediction strategies that suggest additional
predictors and additional criteria.



Alternate Prediction Strategies

It is well known that there are training objectives in graduate
education (e.g., independent scholarship) not explicitly represented
in conventional criteria, and there are important student abilities
not represented among traditional selection measures (e.g., creative
potential). In general, many training programs are characterized by
multiple criteria of success which may not be highly related to one
another and may depend upon different abilities. Or as may be more
likely in graduate education, one department may emphasize one set of
objectives while another department in the same discipline may stress
other outcomes.

It may be easier to appreciate multiple criteria of success when
examining actual job performance. There has been relatively little
research on the relationship of performance in graduate school to
later professional success but one elaborate study by Creager and
Harmon (1966) includes the same predictors examined in this review.

The median validity coefficients for GRE Advanced, Recommendations,
and Undergraduate GPA were as follows for three on-the-job criteria.
Rating of scientific knowledge: .27, .23, .29; income: .11, -.05, .03;
citations of the individual's publications: .28, .07, .12. The study
involved [passage illegible in source]; the correlations can be
assumed to be fairly stable. It is evident that each predictor has a
modest correlation with a later rating of scientific knowledge, no
predictor is related to income, and only the GRE Advanced predicts
scholarly citations. (The authors report the latter to be a very
promising measure since accumulated yearly data should provide a more
reliable index than that available for this study.)

These limited data suggest that different predictors (or, in the case 
of income, no predictors) are relevant to different criteria. There are 
many quite defensible criteria of professional success: eminence as a 
scholar, teaching skill, professional leadership, etc. It is preferable 
to develop such criteria in the actual job situation, but for most practical 
purposes this would require prohibitive time and expense. But it is
possible and highly desirable to use a stepping stone procedure by developing 
better intermediate criteria that can be measured during graduate training. 

Students exhibit many forms of incipient professional behavior in 
graduate school, though we typically make little effort to evaluate such 
behavior in relation to selection procedures and training objectives. It 
can be useful to develop alternate prediction strategies that reflect the 
reality of varied training objectives. Figure 2 illustrates the intended 
connections in the case of three possible program objectives: to train the 
practitioner, teacher, or scholar/scientist. Of course the criteria of
success for a practitioner will vary from field to field, particularly
if professional schools are considered.

Figure 2 speaks mostly for itself. It implies that different
departments or programs within departments may emphasize different
training objectives which, in turn, should be related to the way
students are selected and the way their performance is evaluated. It
is assumed that academic competency in the subject field is always an
important success criterion, but beyond that, different training
objectives imply multiple and often different criteria.

The first requirement in opening the possibility of alternate models
of selection-training-evaluation is development of the necessary
criteria. More than likely these would need to be specially
constructed and then combined into a composite to be predicted by an
appropriate combination of predictors. Developing the criterion
components might involve faculty ratings of different student
competencies, a common examination of subject matter competence,
systematic identification of accomplishments, special means of
collecting outside judgments, or whatever procedures may be required
to obtain information that is relevant to the specific training
objectives considered important. Some recent work by Reilly (1971) is
a good example of progress along these lines. The notion of multiple
criteria related to different training objectives has several
advantages.

First, it encourages the use of additional predictor variables
which may not enhance prediction of conventional criteria but are
nonetheless relevant to important aspects of success in some programs.
In this way it becomes feasible to demonstrate the validity of
creativity tests, cognitive styles, or special accomplishments (see
Frederiksen and Ward, 1972, Witkin, 1972, and Wallach, 1972, for
discussions of recent developments in these areas). The simple reason
is that specialized criteria can give such predictors something to
shoot at. Using measures of this sort for selecting graduate students
has the very desirable effect of broadening the conception of talent.

Second, the model depicted in Figure 2 is more likely to result in
appropriate matching between student characteristics and program
characteristics than one would expect to occur under a single,
aptitude-dominated mode of prediction. Improved matching should result
in more student satisfaction and overall competency in a class of
graduate students.

Third, the proposed view assumes that prediction and selection are
inseparable from program design and evaluation. Consequently, the
process of defining an appropriate prediction strategy forces
desirable attention to the intended outcomes of the program and the
relationship of the curriculum to those outcomes.



In summary, the available objective evidence suggests that the
accuracy of predicting which students will succeed in a particular
graduate school is often no better than modest, especially if such
predictions are based only upon a test or a grade record. Taken
together these two types of predictors do a reasonably good job,
considering the restricted range of ability involved. The best way to
improve selection of graduate students will be to develop improved
criteria of success. This is no small job for graduate faculties, but
it carries the promise of more effective utilization of talent and
greater assurance of equity in admitting students to advanced levels
of training and the privilege associated with such programs.



Table 1. Median Validity Coefficients(a) for Various
Predictors and Criteria of Success in Graduate School

                                     Criteria of Success

Predictors                Graduate  Overall   Dept.   Attain   Time to
                          GPA       Faculty   Exam    Ph.D.    Ph.D.
                                    Rating

1. GRE-Verbal               .24       .31      .42     .18      .16
                             46        27        5      47       18

2. GRE-Quantitative         .23       .27      .27     .26      .25
                             43       [?]        5     [?]       18

3. GRE-Advanced             .30       .30      .48     .35      .34
                            [?]         8        2      40       18

4. GRE-Composite            .33       .41       *      .31      .35
                             30         6               53       18

5. Undergraduate GPA        .31       .37       *      .14      .23
                             26        15               30      [?]

6. Recommendations           *         *        *      .18      .23
                                                        15        9

7. GRE-GPA Composite        .45        *        *      .40      .40
   (weighted)                24                         16        9

(a) The lower number in each pair represents the number of
    coefficients upon which each median is based.
 *  No data available.
[?] Entry illegible in source.